A Study of Residue Correlation within Protein Sequences and Its Application to Sequence Classification

Authors

  • Christopher Michael Hemmerich
  • Sun Kim
Abstract

We investigate methods of estimating residue correlation within protein sequences. We begin by using the mutual information (MI) of adjacent residues, and improve our methodology by defining the mutual information vector (MIV) to estimate long-range correlations between nonadjacent residues. We also consider correlation based on residue hydropathy rather than protein-specific interactions. Finally, in family classification experiments, the modeling power of MIV proved significantly better than that of the classic MI method, reaching a level where proteins can be classified without alignment information.
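The abstract does not give the exact formulas, so the following Python sketch is only one plausible reading of the idea. It assumes that MI is estimated from empirical residue-pair frequencies within a single sequence, and that the MIV collects the MI values for increasing gaps between the paired residues. The function names, the `gap` convention (gap 0 means adjacent residues), and the default `max_gap` of 20 are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(seq, gap=0):
    """Empirical MI (in bits) between residues at positions i and i + gap + 1.

    Assumption: probabilities are estimated from pair counts within `seq`.
    gap=0 pairs adjacent residues.
    """
    step = gap + 1
    pairs = [(seq[i], seq[i + step]) for i in range(len(seq) - step)]
    if not pairs:
        return 0.0  # sequence too short for this gap
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)   # marginal of the first residue
    right = Counter(b for _, b in pairs)  # marginal of the second residue
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # p_ab / (p_a * p_b) simplifies to count * n / (left[a] * right[b])
        mi += p_ab * log2(count * n / (left[a] * right[b]))
    return mi


def mutual_information_vector(seq, max_gap=20):
    """Hypothetical MIV: MI values for gaps 0 .. max_gap - 1."""
    return [mutual_information(seq, gap) for gap in range(max_gap)]
```

Under these assumptions, each sequence maps to a fixed-length vector regardless of its own length, so vectors from different proteins can be compared directly, which is consistent with the abstract's claim that classification needs no alignment information.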

Journal title:

Volume 2007, Issue

Pages -

Publication date: 2007